Is writing style predictive of scientific fraud?

Authors

  • Chloé Braud
  • Anders Søgaard
Abstract

The problem of detecting scientific fraud using machine learning was recently introduced, with initial, positive results from a model taking into account various general indicators. The results seem to suggest that writing style is predictive of scientific fraud. We revisit these initial experiments and show that the leave-one-out testing procedure they used likely leads to a slight overestimate of the predictability, but also that simple models can outperform their proposed model by some margin. We go on to explore more abstract linguistic features, such as linguistic complexity and discourse structure, only to obtain negative results. Upon analyzing our models, we do see some interesting patterns, though: scientific fraud, for example, contains less comparison, as well as different types of hedging and ways of presenting logical reasoning.

Similar resources

How to improve English articles writing methods

Introduction: Today, English and its use as an international language is agreed upon by all, and a wealth of scientific articles written every day in the whole world is in English. The purpose of this study is to project some commonly used grammatical points that native speakers of Persian, unfortunately, do not follow or do not have adequate knowledge to use when writing in English, but this s...

Stylometry-based Fraud and Plagiarism Detection for Learning at Scale

Fraud detection in free and natural text submissions is a major challenge for educators in general. It is even more challenging to detect plagiarism at scale and in online classes such as Massive Open Online Courses. In this paper, we introduce a novel method that analyses the writing style of an author (stylometry) to identify plagiarism. We will show that our system scales to thousands of sub...

The study and recognition of artistic dyes in the Islamic period of Iran in writing and painting (Based on poetry of Khorasanid style poets)

The main features of Iranian painting in the post-Islamic centuries are the association with Persian literature. Persian literature and Persian art have intrinsic links, since the artist and poet are based on the unit's vision, rooted in a culture and intellectual space, to create. The result of this poet's creation is a literary work, and this work can have all the features of the work of art....

Exploring Stylistic Variation with Age and Income on Twitter

Writing style allows NLP tools to adjust to the traits of an author. In this paper, we explore the relation between stylistic and syntactic features and authors’ age and income. We confirm our hypothesis that for numerous feature types writing style is predictive of income even beyond age. We analyze the predictive power of writing style features in a regression task on two data sets of around ...

Linguistic Traces of a Scientific Fraud: The Case of Diederik Stapel

When scientists report false data, does their writing style reflect their deception? In this study, we investigated the linguistic patterns of fraudulent (N = 24; 170,008 words) and genuine publications (N = 25; 189,705 words) first-authored by social psychologist Diederik Stapel. The analysis revealed that Stapel's fraudulent papers contained linguistic changes in science-related discourse...

Journal:
  • CoRR

Volume abs/1707.04095

Publication date: 2017